1. INTRODUCTION

The feature table contains information about genes and gene products, as well as regions of 
biological significance reported in a sequence. It contains information on regions of the sequence 
that code for proteins and RNA molecules. It also enumerates differences between different reports 
of the same sequence and provides cross-references to other data collections, as described in more 
detail below. 

The first two lines of the feature table in IMGT/LIGM-DB entries are feature header (FH) lines, 
specific to the EMBL flatfile format. The first one includes the column headers 'Key' and 
'Location/Qualifier'. The second one is an empty spacer line. 

Each feature consists of a feature key and a location (see below for details). If the location does not 
fit on the same line as the key, a continuation line may follow. If further information about the 
sequence is required, one or more additional lines containing feature qualifiers may follow. 

Features appear on FT lines. The linetype code FT appears in columns 1-2 and columns 3-5 are
blank. The feature key begins in column 6 and may be no more than 15 characters in length. The
location begins in column 26. Feature qualifiers begin on subsequent FT lines at column 26. 
Location, qualifier, and continuation lines may extend from column 26 to 80. Each qualifier is added 
on a new line. 
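
The column rules above can be sketched as a small reader. This is only an illustrative sketch, not part of the IMGT/LIGM-DB software: the function name `parse_feature_lines` and the dictionary layout are assumptions, and continuation lines are joined with a single space, which the text does not specify.

```python
from typing import Dict, List

KEY_START = 5    # zero-based index of column 6, where the feature key begins
DATA_START = 25  # zero-based index of column 26, where location/qualifiers begin


def parse_feature_lines(lines: List[str]) -> List[Dict]:
    """Group FT lines into features (hypothetical helper, see lead-in)."""
    features: List[Dict] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.startswith("FT"):
            continue  # FH header lines and other line types are skipped
        key = line[KEY_START:DATA_START].strip()
        data = line[DATA_START:].strip()
        if key:
            # A non-blank key column starts a new feature descriptor line.
            features.append({"key": key, "location": data, "qualifiers": []})
        elif not features:
            continue  # continuation without a preceding feature: ignored
        elif data.startswith("/"):
            # Each qualifier starts on its own line with a slash.
            features[-1]["qualifiers"].append(data)
        elif features[-1]["qualifiers"]:
            # Continuation of the last qualifier (joined with a space: assumption).
            features[-1]["qualifiers"][-1] += " " + data
        else:
            # Continuation of a location that did not fit on the key line.
            features[-1]["location"] += data
    return features
```

A line is classified only by what sits in the key columns and by whether the data starts with a slash, which mirrors the prose description without relying on any other line content.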



2. FORMAT EXAMPLE 



An example of the feature table format is:

    +    +    +    +    +    +    +    +    +    +    +    +    +    +
        10        20        30        40        50        60        70
     Key                 Location/Qualifiers
     L-PART1             1..28
     V-GENE              1..222
                         /cell_type="B cell"
                         /note="NCBI gi: 483900"
                         /partial
                         /product="immunoglobulin kappa chain, V-region
                         (SPK.4)"
                         /tissue_type="Graves' thyroid"
    +    +    +    +    +    +    +    +    +    +    +    +    +    +
        10        20        30        40        50        60        70

Thus, there are 4 types of feature table lines: 

Line type            Content                    #/entry      #/feature

Header               Column titles              1            N/A
Feature descriptor   Key and location           1 to many    1
Feature qualifiers   Qualifiers and values      N/A          0 to many
Continuation lines   Feature descriptor or      0 to many    0 to many
                     qualifier continuation



The position of the data items within the feature descriptor line is as
follows:

column position      data item

1-5                  blank (may be used to improve readability, i.e. FT)
6-24                 feature key
25                   blank
26-80                location
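
As a hedged illustration of the column table above, the following sketch writes a feature descriptor line. The function name `format_descriptor` and its error handling are assumptions made for this example; it follows the 6-24 key columns of the table rather than any other length limit.

```python
def format_descriptor(key: str, location: str, line_code: str = "") -> str:
    """Lay out a feature descriptor line by column (hypothetical helper).

    Columns 1-5 hold blanks or a line code such as FT, columns 6-24 the
    feature key, column 25 a blank, and columns 26-80 the location.
    """
    if len(line_code) > 5:
        raise ValueError("line code must fit in columns 1-5")
    if len(key) > 19:
        raise ValueError("feature key must fit in columns 6-24")
    line = line_code.ljust(5) + key.ljust(19) + " " + location
    if len(line) > 80:
        raise ValueError("location runs past column 80; use a continuation line")
    return line
```

For example, `format_descriptor("V-GENE", "1..222", "FT")` places the key at column 6 and the location at column 26.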



Data on the qualifier and continuation lines begins in column position 26 (the first 25 columns
contain blanks). On a qualifier line, the first character is a '/' followed by the qualifier description.
Qualifiers used here are the same as the EMBL qualifiers, with one exception: the AA_number qualifier.

The sections below provide a brief introduction to the new feature table format. 
3. FEATURE KEYS 



The first item on an FT line is the feature key. It starts in column 6 and can continue to column 24. 
The list of valid feature keys is shown below: 



Label name 


Definition 


(DJ)-C-CLUSTER 


genomic DNA in rearranged configuration including at least one D-J- 
GENE and one C-GENE 


(DJ)-J-C-CLUSTER 


genomic DNA in rearranged configuration including at least one D-J- 
GENE, one J-GENE and one C-GENE 


(DJ)-J-CLUSTER 


genomic DNA in rearranged configuration including at least one D-J- 
GENE, and one J-GENE 


(VDJ)-C-CLUSTER 


genomic DNA in rearranged configuration including at least one V-D-J- 
GENE and one C-GENE 


(VDJ)-J-C-CLUSTER 


genomic DNA in rearranged configuration including at least one V-D-J- 
GENE, one J-GENE and one C-GENE 


(VDJ)-J-CLUSTER 


genomic DNA in rearranged configuration including at least one V-D-J- 
GENE and one J-GENE 


(VJ)-C-CLUSTER 


genomic DNA in rearranged configuration including at least one V-J- 
GENE and one C-GENE 


(VJ)-J-C-CLUSTER 


genomic DNA in rearranged configuration including at least one V-J- 
GENE, one J-GENE and one C-GENE 
(VJ)-J-CLUSTER


genomic DNA in rearranged configuration including at least one V-J- 
GENE and one J-GENE 


1st-CYS


codon (3 nucleotides) for Cysteine in conserved position in FR1 


2nd-CYS 


codon (3 nucleotides) for Cysteine in conserved position in FR3 


3'D-HEPTAMER 


7 nucleotide recombination site like CACAGTG, part of a 3'D-RS 


3'D-NONAMER


9 nucleotide recombination site like ACAAAAACC, part of a 3'D-RS


3'D-RS


recombination signal including the 3'D-HEPTAMER, 3'D-SPACER,
and 3'D-NONAMER in 3' of the D-REGION of a D-GENE


3'D-SPACER


12 or 23 nucleotide spacer between the 3'D-HEPTAMER and 3'D-
NONAMER of a 3'D-RS


3'UTR


3' untranslated sequence, EMBL feature Key signification


5'D-HEPTAMER


7 nucleotide recombination site like CACTGTG, part of a 5'D-RS


5'D-NONAMER


9 nucleotide recombination site like GGTTTTTGT, part of a 5'D-RS


5'D-RS


recombination signal including the 5'D-NONAMER, 5'D-SPACER and
5'D-HEPTAMER in 5' of the D-REGION of a D-GENE, or in 5' of the
D-REGION of D-J-GENE


5'D-SPACER


12 or 23 nucleotide spacer between the 5'D-HEPTAMER and 5'D-
NONAMER of a 5'D-RS


5'UTR 


5' untranslated sequence, EMBL feature Key signification 


ACCEPTOR-SPLICE 


splicing site in 5' of coding region (nagnn), with splicing occurring after 
g 


C-CLUSTER


genomic DNA including more than one C-GENE


C-GENE


genomic DNA including C-REGION (and INTRONs if present) with 5'
UTR and 3' UTR


C-LIKE-DOMAIN


coding region of non-IG and non-TR similar to an IG or TR C-
DOMAIN


C-REGION 


coding region of C-GENE or corresponding region in cDNA 


C-SEQUENCE 


cDNA including C-REGION (and INTRONs for unspliced cDNA) with 
5' UTR and 3' UTR 


CAAT_SIGNAL


'CAAT box' in eukaryotic promoters, EMBL Feature Key signification


CAP_SITE


mRNA cap site


CDR1 


first complementarity determining region 


CDR1-IMGT 


first complementarity determining region according to the IMGT unique 
numbering 


CDR2 


second complementarity determining region 


CDR2-IMGT 


second complementarity determining region according to the IMGT 
unique numbering 


CDR3 


third complementarity determining region 


CDR3-IMGT 


third complementarity determining region according to the IMGT 
unique numbering 


CH-S 


3' end of CH3 or CH4 exon or independent exon which encodes the 
hydrophilic C-terminal end of soluble Ig, or corresponding region in 
cDNA 


CH-SD 


duplicated CH-S exon of IG heavy C-GENE (found in teleostei), or 
corresponding region in cDNA 
CH-T


small terminal exon in truncated heavy chain transcript resulting from
alternative splicing


CH-X


unusual exon of Ig heavy C-GENE, or corresponding coding region in
cDNA


CH1


first exon of Ig heavy C-GENE, or corresponding coding region in
cDNA


CH1D


duplicated CH1 exon of IG heavy C-GENE (found in teleostei), or
corresponding region in cDNA


CH2 


second exon of Ig heavy C-GENE (or part of the second exon when 
hinge sequence belongs to the same exon), or corresponding coding 
region in cDNA 


CH2D 


duplicated CH2 exon of IG heavy C-GENE (found in teleostei), or 
corresponding region in cDNA 


CH3 


third exon of Ig heavy C-GENE (including CH-S if present), or 
corresponding coding region in cDNA 


CH3D 


duplicated CH3 exon of IG heavy C-GENE (found in teleostei), or 
corresponding region in cDNA 


CH4 


fourth exon of Ig heavy C-GENE (including CH-S if present), or 
corresponding coding region in cDNA 


CH4D 


duplicated CH4 exon of IG heavy C-GENE (found in teleostei), or 
corresponding region in cDNA 


CH5 


fifth exon of Ig heavy C-GENE, or corresponding coding region in 
cDNA


CH6 


sixth exon of Ig heavy C-GENE, or corresponding coding region in 
cDNA 


CH7 


seventh exon of Ig heavy C-GENE, or corresponding coding region in 
cDNA 


CL 


exon of Ig light C-GENE, or corresponding coding region in cDNA 


CONFLICT


independent determinations differ, EMBL Feature Key signification


CONNECTING-
REGION


coding region connecting the membrane proximal C-DOMAIN (or C-
LIKE-DOMAIN) and the TRANSMEMBRANE-REGION


CONSERVED-TRP


codon (3 nucleotides) for Tryptophan in conserved position in FR2-


CYTOPLASMIC-
REGION


coding intracytoplasmic region


D-(DJ)-C-CLUSTER


genomic DNA in rearranged configuration including at least one D-
GENE, one D-J-GENE and one C-GENE


D-(DJ)-CLUSTER


genomic DNA in rearranged configuration including at least one D-
GENE and one D-J-GENE


D-(DJ)-J-C-CLUSTER


genomic DNA in rearranged configuration including at least one D-
GENE, one D-J-GENE, one J-GENE and one C-GENE


D-(DJ)-J-CLUSTER


genomic DNA in rearranged configuration including at least one D-
GENE, one D-J-GENE, and one J-GENE


D-CLUSTER 


genomic DNA in germline configuration including more than one D- 
GENE 


D-GENE 


germline genomic DNA including D-REGION with 5' UTR and 3' UTR, 
also designated as D-SEGMENT


D-J-C-CLUSTER


genomic DNA in germline configuration including at least one D-
GENE, one J-GENE and one C-GENE


D-J-C-SEQUENCE


partially rearranged cDNA including D-, J- and C- REGION with
5'UTR and 3'UTR


D-J-CLUSTER


genomic DNA in germline configuration including at least one D-GENE
and one J-GENE


D-J-GENE


partially rearranged genomic DNA including D-J-REGION with 5' UTR
and 3' UTR, also designated as D-J-SEGMENT


D-J-REGION


coding region of D-J-GENE


D-J-SEQUENCE


partially rearranged cDNA including D- and J- REGION with 5'UTR
and 3'UTR


D-REGION


coding region of D-GENE (plus 1 or 2 nucleotide(s) after the 5'D-
HEPTAMER and/or before the 3'D-HEPTAMER, if present), or
corresponding region in cDNA


D-SEQUENCE


germline cDNA including D-REGION with 5'UTR and 3'UTR


D1-REGION


coding region of the first D-GENE, when more than one D-GENE is
involved in a JUNCTION, or corresponding coding region in cDNA


D2-REGION


coding region of the second D-GENE, when more than one D-GENE is
involved in a JUNCTION, or corresponding coding region in cDNA


D3-REGION


coding region of the third D-GENE, when more than one D-GENE is
involved in a JUNCTION, or corresponding coding region in cDNA


DECAMER 


10 nucleotide regulation site or decanucleotide, includes OCTAMER, in 
the 5'UTR of a V-, V-D-, or V-D- J-GENE 


DELETION 


point out a deletion compared to other sequences 


DONOR-SPLICE 


splicing site in 3' of coding region (ngt), with splicing occurring before g 


DUPLICATION 


point out pattern duplication inside the sequence 


ENHANCER 


Cis-acting enhancer of promoter function, EMBL Feature Key 
signification 


EX1 


first exon of TcR C-GENE, or corresponding region in cDNA


EX2


second exon of TcR C-GENE, or corresponding region in cDNA


EX2A


exon 2A of TR C-GENE with exon 2 polymorphism by
insertion/deletion or corresponding region in cDNA 


EX2B 


exon 2B of TR C-GENE with exon 2 polymorphism by 
insertion/deletion or corresponding region in cDNA 


EX2C 


exon 2C of TR C-GENE with exon 2 polymorphism by 
insertion/deletion or corresponding region in cDNA 


EX2R 


duplicated exon 2 of human TcR gamma C-GENE, or corresponding
region in cDNA 


EX2T 


triplicated exon 2 of human TcR gamma C-GENE, or corresponding 
region in cDNA 


EX3


third exon of TcR C-GENE, or corresponding region in cDNA


EX4 


fourth exon of TcR C-GENE, or corresponding region in cDNA 


EXON 


exon of non Ig or non TcR genes, or corresponding coding region in 
cDNA 



FR1 


first framework 


FR1-IMGT


first framework according to the IMGT unique numbering 


FR2 


second framework 


FR2-IMGT 


second framework according to the IMGT unique numbering 


FR3 


third framework 


FR3-IMGT


third framework according to the IMGT unique numbering


FR4-IMGT


fourth framework according to the IMGT unique numbering


GENE 


genomic DNA including EXONs and INTRONs with 5' UTR and 3'
UTR and corresponding unspliced and spliced cDNAs for non-IG and
non-TR genes


H 


hinge exon of Ig heavy C-GENE, or corresponding region in cDNA 


H1


first hinge exon of Ig heavy C-GENE, or corresponding region in cDNA


H2 


second hinge exon of Ig heavy C-GENE, or corresponding region in 
cDNA 


H3 


third hinge exon of Ig heavy C-GENE, or corresponding region in
cDNA


H4


fourth hinge exon of Ig heavy C-GENE, or corresponding region in
cDNA


H5 


fifth hinge exon of Ig heavy C-GENE, or corresponding region in cDNA 


HEPTANUCLEOTIDE 


7 nucleotide regulation site like CTCATGA, in 5'UTR of a V-, V-D-,
V-D-J-, or V-J-GENE


HINGE-REGION 


coding region encoding the hinge in spliced cDNA 


I-EXON


non coding exon located upstream of the switch, or corresponding 
region in cDNA 


INDETERMINATION 


point out an indetermination for a pattern 


INIT-CODON 


initiation codon ATG 


INIT-CONS 


consensus sequence upstream the INIT-CODON 


INSERTION 


point out an insertion of one or more nucleotides compared with old
release of the sequence or with a similar sequence


INT-DONOR-SPLICE


alternative donor splice site located in a coding region


INTERNAL-
HEPTAMER


internal 7 nucleotide recombination site in V-REGION


INTRON


transcribed region excised by mRNA splicing, EMBL Feature Key
signification


J-C-CLUSTER


genomic DNA in germline configuration including at least one J-GENE
and one C-GENE


J-C-INTRON


non coding region between the most 3' J-GENE and the following C-
GENE, or corresponding sequence in unspliced cDNA


J-C-REGION


coding region including J- and C-REGION, in spliced cDNA


J-C-SEQUENCE


germline cDNA including J- and C-REGION (J-C-REGION in spliced
cDNA, J-REGION, J-C-INTRON, and C-REGION in unspliced cDNA)


J-CLUSTER 


genomic DNA in germline configuration including more than one J- 
GENE 


J-GENE 


germline genomic DNA including J-REGION with 5' UTR and 3' UTR, 
also designated as J-SEGMENT


J-HEPTAMER 


7 nucleotide recombination site, like CACAGTG, part of a J-RS 


J-NONAMER 


9 nucleotide recombination site, like GGTTTTTGT, part of a J-RS 


J-PHE


conserved phenylalanine in J-REGION of Ig light chain or TcR


J-REGION


coding region of J-GENE (plus 1 or 2 nucleotide(s) after J-
HEPTAMER, if present) or corresponding region in cDNA


J-RS


recombination signal including J-HEPTAMER, J-SPACER and J-
NONAMER in 5' of J-REGION of a J-GENE or J-SEQUENCE


J-SEQUENCE


germline cDNA including J-REGION with 5'UTR and 3'UTR


J-SPACER


12 or 23 nucleotide spacer between the J-NONAMER and the J-
HEPTAMER of a J-RS


J-TRP


conserved tryptophan in J-REGION of Ig heavy chain


JUNCTION


coding region encompassing the V-J or V-D-J junction from 2nd-CYS to
the J-PHE or J-TRP of the J-REGION


L-INTRON-L 


sequence including L-PART1, V-INTRON and L-PART2, in genomic
DNA, or corresponding sequence in unspliced cDNA


L-PART1


exon encoding the first part of the leader peptide of a V-, V-D-, V-D-J-
or V-J-GENE or corresponding region in unspliced cDNA


L-PART2


5' region of V-EXON encoding the second part of leader peptide of a V-,
V-D-, V-D-J- or V-J-GENE or corresponding region in unspliced cDNA


L-REGION


coding region encoding the leader peptide in spliced cDNA


L-V-D-J-C-REGION


coding region including L-, V-, any D- and any N- REGION, J- and C-
REGION, in cDNA


L-V-D-J-C-SEQUENCE


rearranged cDNA including L-REGION (or L-PART1 and L-PART2 for
unspliced cDNA), V-, D-, J- and C-REGION with 5'UTR and 3'UTR


L-V-D-J-REGION


coding region including L-, V-, any D- and any N- REGION, and J-
REGION, in cDNA


L-V-D-REGION


coding region including L-, V- and any D- and any N-REGION, in
cDNA


L-V-D-SEQUENCE


partially rearranged cDNA including L-REGION (or L-PART1 and L-
PART2 for unspliced cDNA), V- and D- REGION with 5'UTR and
3'UTR


L-V-J-C-REGION 


coding region including L-, V-, J- and C- REGION, in cDNA 


L-V-J-C-SEQUENCE 


rearranged cDNA including L-REGION (or L-PART1 and L-PART2 for 
unspliced cDNA), V-, J- and C-REGION with 5'UTR and 3'UTR 


L-V-J-REGION 


coding region including L-, V-, and J- REGION, in cDNA 


L-V-REGION 


coding region including L- and V- REGION, in cDNA 


L-V-SEQUENCE 


germline cDNA including L-REGION (or L-PART1 and L-PART2 for 
unspliced cDNA) and V-REGION with 5' and 3'UTR 


LINKER 


short nucleotide sequence used to link 2 other nucleotide sequences 


M 


membrane exon of genomic C-GENE, or corresponding region in cDNA 


M1


1st membrane exon of genomic C-GENE, or corresponding region in
cDNA


M2


2nd membrane exon of genomic C-GENE, or corresponding region in
cDNA


MISC_FEATURE


region of biological significance that cannot be described by other
feature, EMBL Feature Key signification


MISC_RECOMB


Miscellaneous recombination feature, EMBL Feature Key signification


MODIFICATION


shows a modification of the sequence or annotations compared to older
release of the sequence or similar sequences


MUTATION 


A mutation alters the sequence here, EMBL Feature Key signification 


N-AND-D-J-REGION 


coding region including N-AND-D- and J-REGION, in rearranged 
genomic DNA or corresponding region in cDNA


N-AND-D-REGION


coding region encompassing the N diversity sequences and coding
region of D-GENE(s) in rearranged genomic DNA, or corresponding
region in cDNA


N-GLYCOSYLATION-
SITE


potential N glycosylation site encoded by the motif Asn-X-Ser/Thr
where X is different from Pro


N-REGION


coding region encompassing the N diversity sequence


N1-REGION


coding region encompassing the first N diversity sequence, when more
than one N-REGION is involved


N2-REGION


coding region encompassing the second N diversity sequence, when
more than one N-REGION is involved


N3-REGION


coding region encompassing the third N diversity sequence, when more
than one N-REGION is involved


N4-REGION 


coding region encompassing the fourth N diversity sequence, when 
more than one N-REGION is involved 


OCTAMER 


8 nucleotide regulation site or octanucleotide, in the 5'UTR of a V-, V- 
D-, V-D-J-, or V-J-GENE 


P-REGION 


region encompassing the P sequence 


PENTADECAMER 


15 nucleotide regulation site or pentadecanucleotide, in the 5'UTR of a
V-, V-D-, V-D-J-, or V-J-GENE 


POLYA_SIGNAL


signal for cleavage & polyadenylation, EMBL Feature Key signification


POLYA_SITE


site at which polyadenine is added to mRNA, EMBL Feature Key
signification


PRIMER_BIND


non-covalent primer binding site, EMBL Feature Key signification


PYR-RICH


regulation site rich in pyrimidine bases, in a genomic gene


REPEAT_UNIT


one repeat unit of a repeat region, EMBL Feature Key signification 


SILENCER 


inhibitor signal for gene transcription, in genomic DNA 


TRANSCRIPT 


unspliced or spliced cDNA corresponding either to a L-V-SEQUENCE, 
D-SEQUENCE, J-SEQUENCE or J-C-SEQUENCE in germline
configuration, a L-V-D-SEQUENCE, D-J-SEQUENCE or D-J-C- 
SEQUENCE, or a C-SEQUENCE 


STOP-CODON 


codon which stops gene translation 


SWITCH 


switch sequence in the IGH locus 


TATA_BOX


TATA signal in eukaryotic promoters 


TRANSMEMBRANE- 
REGION 


coding transmembrane region 


UNSURE 


authors are unsure about the sequence in this region, EMBL Feature Key 
signification 
UTR


untranslated sequence


V-(DJ)-C-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE, one D-J-GENE and one C-GENE


V-(DJ)-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE and one D-J-GENE


V-(DJ)-J-C-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE, one D-J-GENE, one J-GENE and one C-GENE


V-(DJ)-J-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE, one D-J-GENE and one J-GENE


V-(VDJ)-C-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE, one V-D-J-GENE and one C-GENE


V-(VDJ)-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE and one V-D-J-GENE


V-(VDJ)-J-C-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE, one V-D-J-GENE, one J-GENE and one C-GENE


V-(VDJ)-J-CLUSTER 


genomic DNA in rearranged configuration including at least one V- 
GENE, one V-D-J-GENE and one J-GENE 


V-(VJ)-C-CLUSTER 


genomic DNA in rearranged configuration including at least one V- 
GENE, one V-J-GENE and one C-GENE 


V-(VJ)-CLUSTER 


genomic DNA in rearranged configuration including at least one V- 
GENE and one V-J-GENE 


V-(VJ)-J-C-CLUSTER 


genomic DNA in rearranged configuration including at least one V-
GENE, one V-J-GENE, one J-GENE and one C-GENE


V-(VJ)-J-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE, one V-J-GENE and one J-GENE


V-CLUSTER


genomic DNA in germline configuration including more than one V-
GENE


V-D-(DJ)-C-CLUSTER 


genomic DNA in rearranged configuration including at least one V-
GENE, one D-GENE, one D-J-GENE and one C-GENE


V-D-(DJ)-CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE, one D-GENE and one D-J-GENE


V-D-(DJ)-J-C-
CLUSTER


genomic DNA in rearranged configuration including at least one V-
GENE, one D-GENE, one D-J-GENE, one J-GENE and one C-GENE


V-D-(DJ)-J-CLUSTER 


genomic DNA in rearranged configuration including at least one V- 
GENE, one D-GENE, one D-J-GENE and one J-GENE 


V-D-EXON 


partially rearranged genomic DNA including L-PART2, V-, any D- and 
N- REGION 


V-D-GENE 


partially rearranged genomic DNA including L-PART1, V-INTRON 
and V-D-EXON, with the 5'UTR and 3'UTR


V-D-J-C-CLUSTER 


genomic DNA in germline configuration including at least one V- 
GENE, one D-GENE and one J-GENE and one C-GENE 


V-D-J-C-REGION 


coding region including V-, any D- and N- REGION, J- and C- 
REGION, in cDNA 


V-D-J-CLUSTER 


genomic DNA in germline configuration including at least one V- 
GENE, one D-GENE and one J-GENE 


V-D-J-EXON 


rearranged genomic DNA including L-PART2, V-, any D- and N-
REGION and J-REGION


V-D-J-GENE


rearranged genomic DNA including L-PART1, V-INTRON and V-D-J-
EXON, with the 5'UTR and 3'UTR


V-D-J-REGION


coding region including V-, any D- and N-REGION, and J-REGION, in
rearranged genomic DNA, or corresponding region in cDNA


V-D-REGION


coding region including V-, any D- and N-REGION, in rearranged
genomic DNA or corresponding region in cDNA


V-EXON 


germline genomic DNA including L-PART2 and V-REGION 


V-GENE 


germline genomic DNA including L-PART1, V-INTRON and V-
EXON, with the 5'UTR and 3'UTR


V-HEPTAMER


7 nucleotide recombination site, like CACAGTG, part of V-RS


V-INTRON


non coding sequence between L-PART1 and V-EXON, in genomic
DNA, or corresponding sequence in unspliced cDNA


V-J-C-CLUSTER


genomic DNA in germline configuration including at least one V-
GENE, one J-GENE and one C-GENE


V-J-C-REGION


coding region including V-, J- and C- REGION, in cDNA


V-J-CLUSTER


genomic DNA in germline configuration including at least one V-GENE
and one J-GENE


V-J-EXON


rearranged genomic DNA including L-PART2, V- and J- REGION


V-J-GENE


rearranged genomic DNA including L-PART1, V-INTRON and V-J-
EXON, with the 5'UTR and 3'UTR


V-J-REGION


coding region including V- and J-REGION, in rearranged genomic
DNA, or corresponding region in cDNA


V-LIKE-DOMAIN 


coding region of non-IG and non-TR similar to an IG or TR V-
DOMAIN


V-NONAMER 


9 nucleotide recombination site, like ACAAAAACC, part of V-RS 


V-REGION 


coding region of V-GENE without the leader peptide (plus 1 or 2 
nucleotide(s) before the V-HEPTAMER, if present), or corresponding 
region in cDNA


V-RS 


recombination signal including V-HEPTAMER, V-SPACER and V-
NONAMER in 3' of V-REGION of a V-GENE or V-SEQUENCE


V-SPACER 


12 or 23 nucleotide spacer between the V-HEPTAMER and the V- 
NONAMER of a V-RS 


VARIATION 


a related population contains stable mutations, EMBL Feature Key 
signification 


scFv 


defines two immunoglobulin (or by extension T cell receptor) V- 
DOMAINs covalently linked by a short linker peptide in vitro 



4. FEATURE LOCATION 



The second item on the FT line designates the location of the feature in the sequence. The location 
begins at column 26. Several conventions are used to indicate sequence location. 

Base numbers in locations refer to the numbering in the entry, which is not necessarily the same as 
the numbering scheme used in the original report. The first base in the presented sequence is 
numbered base 1. Sequences are presented in the 5' to 3' direction.

A location can be one of the following:

o A single base,
o A contiguous span of bases.

A contiguous span of bases is indicated by the number of the first and last bases in the range
separated by two periods (e.g., 23..79). Starting and ending positions can be indicated by base
number.
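
The two location forms described above can be read with a short routine. This is a minimal sketch under stated assumptions: the name `parse_location` is hypothetical, a single base is returned as a span whose start equals its end, and a span whose last base precedes its first is rejected, which the text implies but does not state.

```python
import re
from typing import Tuple

# Either a single base number, or two base numbers separated by two periods.
_LOCATION = re.compile(r"^(\d+)(?:\.\.(\d+))?$")


def parse_location(text: str) -> Tuple[int, int]:
    """Return (first_base, last_base), numbered from 1 (hypothetical helper)."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported location: {text!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    if start < 1 or end < start:
        # Assumption: bases are numbered from 1 and spans run 5' to 3'.
        raise ValueError(f"invalid base range: {text!r}")
    return start, end
```

Any other location syntax is deliberately rejected, since this section describes only single bases and contiguous spans.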



5. FEATURE QUALIFIERS 

Qualifiers provide additional information about features. They take the form of a slash (/) followed 
by a qualifier name and, if applicable, an equal sign (=) and a qualifier value. Feature qualifiers 
begin at column 26. 
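
The slash, name, and optional value structure can be illustrated with a small splitter. This is a sketch only: `split_qualifier` is a hypothetical name, and the value is returned exactly as written, including any surrounding quotation marks.

```python
from typing import Optional, Tuple


def split_qualifier(text: str) -> Tuple[str, Optional[str]]:
    """Split '/name=value' or '/name' into its parts (hypothetical helper)."""
    if not text.startswith("/"):
        raise ValueError("a qualifier must start with a slash")
    # Only the first equal sign separates the name from the value.
    name, separator, value = text[1:].partition("=")
    if not name:
        raise ValueError("qualifier name is missing")
    return name, (value if separator else None)
```

A qualifier without a value, such as `/partial`, yields `None` for the value, which keeps it distinct from a qualifier with an empty value.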

Qualifiers convey many types of information. Their values can, therefore, take 
several forms: 

o Free text.
o Controlled vocabulary or enumerated values.
o Citations or reference numbers.
o Sequences.
o Feature labels.

Text qualifier values are enclosed in double quotation marks. The text can consist of any printable 
characters (ASCII values 32-126 decimal). If the text string includes double quotation marks, each 
double quotation mark must be escaped by placing a double quotation mark in front of it 
(e.g., /note="This is an example of ""escaped"" quotation marks").
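
The quoting rule can be sketched as a pair of functions. The names `quote_value` and `unquote_value` are assumptions for illustration; the printable-character check follows the ASCII range 32-126 given above.

```python
def quote_value(text: str) -> str:
    """Enclose free text in double quotes, doubling embedded quotes."""
    for char in text:
        if not 32 <= ord(char) <= 126:
            raise ValueError(f"non-printable character: {char!r}")
    return '"' + text.replace('"', '""') + '"'


def unquote_value(quoted: str) -> str:
    """Reverse quote_value: strip the outer quotes and undouble inner ones."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("value is not enclosed in double quotation marks")
    return quoted[1:-1].replace('""', '"')
```

Applying `quote_value` to the text of the example above reproduces the escaped form shown there, and `unquote_value` recovers the original text.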

Citation or reference numbers for an entry are enclosed in square brackets ([]) to distinguish them 
from other numbers. 



A literal sequence of bases (e.g., "atgcatt") is enclosed in quotation marks. Literal sequences are 
distinguished from free text by context. Qualifiers that take free text as their values do not take literal 
sequences, and vice versa. 

The '/label=' qualifier takes a feature label as its value. Although feature labels are optional, they
allow unambiguous references to features. The feature label identifies a feature within an entry; 
when combined with the accession number and the name of the data bank from which it came, it is a 
unique tag for that feature. 



The following is a list of valid feature qualifiers: 



Qualifier 


Description 


allele 


Name of the allele for a given gene


allotype


polymorphic extracellular marker detected by serological methods and
present in different individuals of the same species


AA_IMGT


Amino acid numbering in the sequence according to IMGT


AA_number


Amino acid numbering in the sequence


cell_line


Cell line from which the sequence was obtained


cell_type


Cell type from which the sequence was obtained 


chromosome 


Chromosome (e.g. Chromosome number) from which the sequence was 
obtained 


citation 


Reference to a citation listed in the entry reference field 


clone 


Clone from which the sequence was obtained 


clone_lib


Clone library from which the sequence was obtained


codon_start


Indicates the offset at which the first complete codon of a coding feature can 
be found, relative to the first base of that feature 


cons_splice 


Differentiates between intron splice sites that conform to the 5'-GT ... AG-3' 
splice site consensus 


country 


Country of origin for DNA sample, intended for epidemiological or 
population studies 


CDR_length


Number of amino acids in CDR1-IMGT, CDR2-IMGT, CDR3-IMGT,
separated by dots, and shown in brackets; X is used for partial or absent
CDR


db_xref


Database cross-reference: pointer to related information in another database


dev_stage


If the sequence was obtained from an organism in a specific developmental
stage, it is specified with this qualifier


evidence


Value indicating the nature of supporting evidence, distinguishing between 
experimentally determined and theoretically derived data 


function 


Function attributed to a sequence 


gdb_xref


Genome Databank unique ID cross reference qualifier


gene 


Symbol of the gene corresponding to a sequence region 


gene_alias


other gene name in the literature


germline 


Denotes that the sequence is from immunoglobulin or T cell receptor 
unrearranged DNA or RNA 


germline_frame


Translation arbitrarily shown in the germline reading frame, for J-REGION 
(and C-REGION in cDNA) of unproductive (genomic or cDNA) rearranged 
sequences 


haplotype 


Haplotype of the organism from which the sequence was obtained 


insertion_seq 


Insertion sequence element from which the sequence was obtained 


in_frame 


No frameshift in the JUNCTION 


isolate 


Individual isolate from which the sequence was obtained 


IMGT_BAC_clone 


Name of the BAC clone from which the sequence is derived 


IMGT_cell_line 


Name of the cell line from which the sequence is derived 


IMGT_cosmid_clone 


Name of the cosmid clone from which the sequence is derived 


IMGT_MAC_clone 


Name of the MAC clone from which the sequence is derived 


IMGT_note 


Comment added by the LIGM curators to the IMGT feature 


IMGT_phage_clone 


Name of the phage clone from which the sequence is derived 


IMGT_plasmid_clone 


Name of the plasmid clone from which the sequence is derived 


IMGT_YAC_clone 


Name of the YAC clone from which the sequence is derived 


label 


A label used to permanently identify a feature 


lab_host 


Laboratory host used to propagate the organism from which the sequence 
was obtained 




map 


Genomic map position of feature 


nomgen 


Name of the gene corresponding to a sequence region 




note 


Any comment or additional information 




number 


A number to indicate the order of genetic elements (e.g., exons or introns) in 
the 5' to 3' direction 


organism 


The scientific name of the organism that provided the sequenced genetic 
material 


out_of_frame 


Frameshift in the JUNCTION 


partial 


Differentiates between complete regions and partial ones 


product 


Name of a product encoded by the sequence 


protein_id 


Protein Identifier, issued by International collaborators. This qualifier 
consists of a stable ID portion (3+5 format with 3 position letters and 5 
numbers) plus a version number after the decimal point. 


pseudo 


Indicates that this feature is a non-functional version of the element named 
by the feature key 


putative_limit 


Refers to uncertain limit(s) of a subregion 


PCR_conditions 


Description of reaction conditions and components for PCR 


rearranged 


Denotes that the sequence is from immunoglobulin or T cell receptor 
rearranged DNA or RNA 


replace 


Indicates that the sequence identified by a feature's intervals is replaced by 
the sequence shown in "text" 


rpt_family 


Type of repeated sequence; Alu or Kpn, for example 


rpt_type 


Organization of repeated sequence 


rpt_unit 


Identity of repeat unit that constitutes a repeat region 


sequenced_mol 


Molecule from which the sequence was obtained 


sex 


Sex of organism from which the sequence was obtained 


specificity 


Specificity of an immunoglobulin or T cell receptor chain 


specific_host 


Natural host from which the sequence was obtained 


standard_name 


Accepted standard name for this feature 


strain 


Strain from which the sequence was obtained 


sub_clone 


Sub-clone from which the sequence was obtained 


sub_species 


Sub-species name of organism from which the sequence was obtained 


sub_strain 


Sub-strain from which the sequence was obtained 


tissue_lib 


Tissue library from which the sequence was obtained 


tissue_type 


Tissue type from which the sequence was obtained 


translation 


Automatically generated one-letter abbreviated amino acid sequence of the 
coding regions 


transl_except 


Translational exception: single codon the translation of which does not 
conform to genetic code defined by Organism and /codon 
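
A few of the qualifiers described above carry values with a fixed shape, and a short sketch may help when reading them programmatically. The Python below is illustrative only and is not part of IMGT/LIGM-DB. The function names are hypothetical, and it makes these assumptions: protein_id letters are upper case, codon_start is 1-based with the values 1, 2 or 3, and any non-numeric CDR_length field stands for a partial or absent CDR.

```python
import re

# Assumed shape of a protein_id value: 3 letters + 5 digits, a dot, a version number.
PROTEIN_ID_RE = re.compile(r"^([A-Z]{3}\d{5})\.(\d+)$")


def parse_protein_id(value):
    """Split a protein_id value into (stable_id, version)."""
    match = PROTEIN_ID_RE.match(value)
    if match is None:
        raise ValueError("not in 3+5.version format: %r" % value)
    return match.group(1), int(match.group(2))


def parse_cdr_length(value):
    """Parse a bracketed, dot-separated CDR_length value such as '[8.8.13]'.

    Numeric fields become integers; any other field (assumed here to mark
    a partial or absent CDR) becomes None.
    """
    text = value.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("CDR_length must be shown in brackets: %r" % value)
    fields = text[1:-1].split(".")
    return [int(field) if field.isdigit() else None for field in fields]


def first_complete_codon(feature_seq, codon_start=1):
    """Return the first complete codon of a coding feature.

    codon_start is assumed to be 1-based, counted from the first base
    of the feature.
    """
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3")
    begin = codon_start - 1
    codon = feature_seq[begin:begin + 3]
    if len(codon) < 3:
        raise ValueError("feature too short to contain a complete codon")
    return codon
```

For instance, under these assumptions parse_protein_id("CAA12345.1") yields the stable ID "CAA12345" and the version 1, and first_complete_codon("GATGCCA", 2) skips the first base and returns "ATG".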



This manual and the database it accompanies may be copied and redistributed freely, without 
Last modified: June 2003 

Software material and data coming from the IMGT server may be used for academic research only, 
provided that it is referred to IMGT, and cited as "IMGT, the international ImMunoGeneTics 
database http://imgt.cines.fr:8104 (Initiator and coordinator: Marie-Paule Lefranc, Montpellier, 
France)." References to cite: Lefranc, M.-P. et al., Nucleic Acids Research, 27, 209-212 (1999); 
Ruiz, M. et al., Nucleic Acids Research, 28, 219-221 (2000); Lefranc, M.-P., Nucleic Acids 
Research, 29, 207-209 (2001); Nucleic Acids Research, 31, 307-310 (2003). 

For any other use please contact Marie-Paule Lefranc (lefranc@ligm.igh.cnrs.fr). 



IMGT initiator and coordinator: Marie-Paule Lefranc (lefranc@ligm.igh.cnrs.fr) 
Bioinformatics manager: Veronique Giudicelli (giudi@ligm.igh.cnrs.fr) 
Computer manager: Denys Chaume (Denys.Chaume@igh.cnrs.fr) 
Interface design: Chantal Ginestoux (chantal@ligm.igh.cnrs.fr) 

© Copyright 1995-2003 IMGT, the international ImMunoGeneTics database 